On the Representation of Voice Source Aperiodicities in the MBE Speech Coding Model

Authors

  • Preeti Rao
  • Pushkar Patwardhan
Abstract

We present an investigation of the representation of voice source aperiodicities in the Multi-Band Excitation (MBE) speech model for the compression of narrowband speech. The MBE model is a fixed-frame-based analysis-synthesis algorithm that combines harmonic and stochastic components to reconstruct speech from estimated model parameters. Pitch cycle perturbations, such as jitter and shimmer, are not captured accurately in the framewise constant parameter estimates, which degrades the reproduced voice quality. The dependence of MBE-reconstructed voice quality on the voice pitch and on the type of perturbation is explored through objective measurements and subjective listening with synthetic and natural speech.
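The point about framewise constant estimates can be illustrated with a minimal sketch. It is not the authors' method or an MBE implementation. It generates per-cycle periods and amplitudes for a synthetic voiced source with jitter and shimmer, then collapses them to one value per frame. The function names, the uniform perturbation model, the 200 Hz pitch, and the four-cycle frame are all assumptions made for illustration.

```python
import random


def perturbed_cycles(f0_hz, n_cycles, jitter_pct, shimmer_pct, seed=0):
    """Return per-cycle (periods in seconds, amplitudes) for a synthetic source.

    Assumption: jitter and shimmer are modelled as independent uniform
    perturbations of at most +/- jitter_pct and +/- shimmer_pct percent.
    """
    rng = random.Random(seed)
    nominal_period = 1.0 / f0_hz
    periods, amplitudes = [], []
    for _ in range(n_cycles):
        periods.append(nominal_period * (1.0 + rng.uniform(-1, 1) * jitter_pct / 100.0))
        amplitudes.append(1.0 + rng.uniform(-1, 1) * shimmer_pct / 100.0)
    return periods, amplitudes


def framewise_constant(values):
    """Collapse per-cycle values to a single frame value (their mean).

    Returns the frame value and the per-cycle residual that a
    constant-per-frame parameter cannot represent.
    """
    mean = sum(values) / len(values)
    residual = [v - mean for v in values]
    return mean, residual


# Hypothetical frame holding four pitch cycles at a nominal 200 Hz.
periods, amplitudes = perturbed_cycles(200.0, 4, jitter_pct=2.0, shimmer_pct=10.0)
frame_period, lost_jitter = framewise_constant(periods)
frame_amplitude, lost_shimmer = framewise_constant(amplitudes)
print("frame period (s):", frame_period)
print("per-cycle period deviation lost:", lost_jitter)
print("per-cycle amplitude deviation lost:", lost_shimmer)
```

In this toy setting the residual lists hold exactly the cycle-to-cycle variation that is discarded when one pitch and one amplitude are kept per frame. At a fixed frame length, a higher pitch places more cycles in each frame, so more per-cycle detail is averaged away.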


Similar resources

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...


Prediction of Voice Aperiodicity Based on Spectral Representations in HMM Speech Synthesis

In hidden Markov model-based speech synthesis, speech is typically parameterized using source-filter decomposition. A widely used analysis/synthesis framework, STRAIGHT, decomposes the speech waveform into a framewise spectral envelope and a mixed mode excitation signal. Inclusion of an aperiodicity measure in the model enables synthesis also for signals that are not purely voiced or unvoiced. ...


Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing where there exist many low-level representations of image, i.e., SIFT, HOG and so on. But there is a missing link across low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principle component analysis are employed to d...


Effectiveness of a periodic and aperiodic decomposition method for analysis of voice sources

Decomposition of speech into periodic and aperiodic components is useful in analyzing and describing the characteristics of voice sources. Such a decomposition is also useful in controlling the excitation source for synthesis. This paper addresses the issue of decomposition of speech into periodic and aperiodic components in the context of speech production. The effectiveness of a recently prop...


Designing a New Non-Parallel Training Method for Voice Conversion with Better Performance than Parallel Training

Introduction: Voice mimicking by computers has been one of the most challenging topics of speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...




Publication date: 2003